perm filename V2N.OUT[TEX,DEK] blob sn#381029 filedate 1978-09-17 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00009 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	%folio 621 galley 1 (C) Addison-Wesley 1978
C00015 00003	%folio 624 galley 2 (C) Addison-Wesley 1978
C00030 00004	%folio 629 galley 3 (C) Addison-Wesley 1978
C00041 00005	%folio 632 galley 4 (C) Addison-Wesley 1978
C00053 00006	%folio 635 galley 5 (C) Addison-Wesley 1978
C00065 00007	%folio 638 galley 6 (C) Addison-Wesley 1978
C00074 00008	%folio 643 galley 7 (C) Addison-Wesley 1978
C00087 00009	
C00088 ENDMK
C⊗;
%folio 621 galley 1 (C) Addison-Wesley 1978

|qleft \12skip \quad An even better scheme for large $n,$ discovered
by Volker Strassen in 1968, is based on the fact that the product
of $2 \times 2$ matrices can be evaluated with only 7 multiplications,
without relying on the commutativity of multiplication as in
(35). Therefore $2n \times 2n$ matrices can be partitioned into
four $n \times n$ matrices and the idea can be used recursively
to obtain the product of $2↑k \times 2↑k$ matrices with only
$7↑k$ multiplications instead of $(2↑k)↑3 = 8↑k.$ Strassen's
original $2 \times 2$ identity [{\sl Numer. Math. \bf 13} (1969),
354--356] used 7 multiplications and 18 additions; S. Winograd
later discovered the following more economical formula:
$$\left(\matrix{a⊗b\cr c⊗d\cr}\right)\left(\matrix{A⊗C\cr B⊗D\cr}\right) = \left(\matrix{aA + bB⊗w + (c + d)(C - A) + (a + b - c - d)D\cr w + (a - c)(D - C) - d(A - B - C + D)⊗w + (a - c)(D - C) + (c + d)(C - A)\cr}\right)\eqno (36)$$
|qright \6skip where $w = aA - (a - c - d)(A - C + D).$ If intermediate
results are appropriately saved, (36) involves 7 multiplications
and only 15 additions; by induction on $k,$ we can multiply
$2↑k \times 2↑k$ matrices with $7↑k$ multiplications
and $5(7↑k - 4↑k)$ additions. The total number of operations
needed to multiply $n \times n$ matrices has therefore been
reduced from order $n↑3$ to $O(n↑{\lg 7}) = O(n↑{2.8074}).$
Strassen showed that a similar reduction applies also to the
evaluation of determinants and matrix inverses; cf. J. R. Bunch
and J. E. Hopcroft, {\sl Math. Comp. \bf 28} (1974), 231--236.
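|qleft \quad Identity (36) is easy to check mechanically. The following minimal Python sketch (an illustrative aid; the function name and random test are invented for this purpose, not part of the text) forms the four entries of the product from the seven multiplications named above and compares the result with the ordinary matrix product:

import random

# The seven multiplications of identity (36);
# w = aA - (a - c - d)(A - C + D) accounts for two of them.
def winograd_2x2(a, b, c, d, A, B, C, D):
    m1 = a * A
    m2 = (a - c - d) * (A - C + D)
    m3 = b * B
    m4 = (c + d) * (C - A)
    m5 = (a + b - c - d) * D
    m6 = (a - c) * (D - C)
    m7 = d * (A - B - C + D)
    w = m1 - m2
    return ((m1 + m3, w + m4 + m5),
            (w + m6 - m7, w + m6 + m4))

for _ in range(1000):
    a, b, c, d, A, B, C, D = (random.randint(-9, 9) for _ in range(8))
    assert winograd_2x2(a, b, c, d, A, B, C, D) == \
           ((a*A + b*B, a*C + b*D), (c*A + d*B, c*C + d*D))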
|qleft \quad These theoretical results are quite striking, but
from a practical standpoint they are of limited use because
$n$ must be very large before we overcome the effect of additional
bookkeeping costs. Richard Brent [Stanford Computer Science
report CS157 (March, 1970); see also {\sl Numer. Math. \bf 16}
(1970), 145--156] found that a careful implementation of Winograd's
scheme (35), with appropriate scaling for numerical stability,
became better than the conventional scheme only when $n ≥ 40,$
and it saved 7 percent of the running time when $n = 100.$ For
complex arithmetic the situation was somewhat different; (35)
became advantageous for $n > 20,$ and saved 18 percent when
$n = 100.$ He estimated that Strassen's scheme would not begin
to excel over (35) until $n \approx 250;$ and such enormous matrices,
containing more than 60,000 entries, rarely occur in practice
(unless they are very sparse, when other techniques apply).
|qleft \quad By contrast, the methods we shall discuss next
are eminently practical and have found wide use. The {\sl finite
Fourier transform} $f$ of a complex-valued function $F$ of $n$
variables, over domains of $m↓1, \ldots , m↓n$ elements, is
defined by the equation
$$f(s↓1, \ldots , s↓n) = \sum ↓{0≤t↓1<m↓1, \ldots , 0≤t↓n<m↓n} \exp\left(2πi\left({s↓1t↓1\over m↓1} +\cdots + {s↓nt↓n\over m↓n}\right)\right)F(t↓1, \ldots , t↓n)\eqno (37)$$
for $0 ≤ s↓1 < m↓1, \ldots , 0 ≤ s↓n < m↓n;$ the
name ``transform'' is justified because we can recover the values
$F(t↓1, \ldots , t↓n)$ from the values $f(s↓1, \ldots , s↓n),$
as shown in exercise 13. In the important special case that
all $m↓j = 2,$ we have
$$f(s↓1, \ldots , s↓n) = \sum ↓{0≤t↓1, \ldots , t↓n≤1} (-1)↑{s↓1t↓1 +\cdots + s↓nt↓n}F(t↓1, \ldots , t↓n)\eqno (38)$$
for $0 ≤ s↓1, \ldots , s↓n ≤ 1,$ and this may be
regarded as a simultaneous evaluation of $2↑n$ linear polynomials
in $2↑n$ variables $F(t↓1, \ldots , t↓n).$ A well-known technique
due to F. Yates [{\sl The Design and Analysis of Factorial Experiments}
(Harpenden: Imperial Bureau of Soil Science, 1937)] can be
used to reduce the number of additions implied in (38) from
$2↑n(2↑n - 1)$ to $n2↑n.$ Yates's method can be understood by
considering the case $n = 3:$ Let $x↓{t↓1t↓2t↓3} = F(t↓1, t↓2,
t↓3).$

|qleft \6skip |newtype
|qctr Given⊗First step⊗Second step⊗Third step\cr
\2skip $x↓{000}⊗x↓{000} + x↓{001}⊗x↓{000} + x↓{001} + x↓{010} + x↓{011}⊗x↓{000} + x↓{001} + x↓{010} + x↓{011} + x↓{100} + x↓{101} + x↓{110} + x↓{111}\cr
x↓{001}⊗x↓{010} + x↓{011}⊗x↓{100} + x↓{101} + x↓{110} + x↓{111}⊗x↓{000} - x↓{001} + x↓{010} - x↓{011} + x↓{100} - x↓{101} + x↓{110} - x↓{111}\cr
x↓{010}⊗x↓{100} + x↓{101}⊗x↓{000} - x↓{001} + x↓{010} - x↓{011}⊗x↓{000} + x↓{001} - x↓{010} - x↓{011} + x↓{100} + x↓{101} - x↓{110} - x↓{111}\cr
x↓{011}⊗x↓{110} + x↓{111}⊗x↓{100} - x↓{101} + x↓{110} - x↓{111}⊗x↓{000} - x↓{001} - x↓{010} + x↓{011} + x↓{100} - x↓{101} - x↓{110} + x↓{111}\cr
x↓{100}⊗x↓{000} - x↓{001}⊗x↓{000} + x↓{001} - x↓{010} - x↓{011}⊗x↓{000} + x↓{001} + x↓{010} + x↓{011} - x↓{100} - x↓{101} - x↓{110} - x↓{111}\cr
x↓{101}⊗x↓{010} - x↓{011}⊗x↓{100} + x↓{101} - x↓{110} - x↓{111}⊗x↓{000} - x↓{001} + x↓{010} - x↓{011} - x↓{100} + x↓{101} - x↓{110} + x↓{111}\cr
x↓{110}⊗x↓{100} - x↓{101}⊗x↓{000} - x↓{001} - x↓{010} + x↓{011}⊗x↓{000} + x↓{001} - x↓{010} - x↓{011} - x↓{100} - x↓{101} + x↓{110} + x↓{111}\cr
x↓{111}⊗x↓{110} - x↓{111}⊗x↓{100} - x↓{101} - x↓{110} + x↓{111}⊗x↓{000} - x↓{001} - x↓{010} + x↓{011} - x↓{100} + x↓{101} + x↓{110} - x↓{111}\cr$
|newtype To get from the ``Given'' to the ``First step''
requires four additions and four subtractions; and the interesting
feature of Yates's method is that exactly the same transformation
which takes us from ``Given'' to ``First step'' will take us
from ``First step'' to ``Second step'' and from ``Second step''
to ``Third step.'' In each case we do four additions, then four
subtractions; and after three steps we have the desired Fourier
transform $f(s↓1, s↓2, s↓3)$ in the place originally occupied
by $F(s↓1, s↓2, s↓3)!$
|qleft \quad This special case is often called the {\sl Walsh
transform} of $2↑n$ data elements, since the corresponding pattern
of signs was studied by J. L. Walsh [{\sl Amer. J. Math. \bf 45}
(1923), 5--24]. Note that the number of sign changes from left
to right in the ``Third step'' above assumes the respective
values 0, 7, 3, 4, 1, 6, 2, 5. Walsh observed that there will
be exactly $0, 1, \ldots , 2↑n - 1$ sign changes in some order
in the general case, so the coefficients provide discrete approximations
to sine waves with various frequencies. (See H. F. Harmuth,
{\sl IEEE Spectrum \bf 6,} 11 (Nov. 1969), 82--91 for applications
of this property, and see Section 7.2.1 for further discussion
of the Walsh coefficients.)
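|qleft \quad In code, Yates's transformation is one short step applied $n$ times. The following minimal Python sketch (illustrative only; the names are invented) forms the pairwise sums in the first half and the pairwise differences in the second half, exactly as in the tableau above, and checks the result against definition (38):

def yates_step(x):
    # sums of adjacent pairs, then differences of adjacent pairs
    half = len(x) // 2
    return ([x[2*k] + x[2*k + 1] for k in range(half)] +
            [x[2*k] - x[2*k + 1] for k in range(half)])

def walsh_transform(F):
    # n steps transform the 2**n given values, using n * 2**n additions
    n = (len(F) - 1).bit_length()
    for _ in range(n):
        F = yates_step(F)
    return F

F = [1, 2, 3, 4, 5, 6, 7, 8]
direct = [sum((-1) ** bin(s & t).count('1') * F[t] for t in range(8))
          for s in range(8)]
assert walsh_transform(F) == direct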
|qleft \quad Yates's method can be generalized to the evaluation
of any finite Fourier transform, and, in fact, to the evaluation
of any sum which can be written
$$f(s↓1, s↓2, \ldots , s↓n) = \sum ↓{0≤t↓1<m↓1, \ldots , 0≤t↓n<m↓n} g↓1(s↓1, s↓2, \ldots , s↓n, t↓1)g↓2(s↓2, \ldots , s↓n, t↓2) \cdots g↓n(s↓n, t↓n)F(t↓1, t↓2, \ldots , t↓n),\qquad 0 ≤ s↓j < m↓j,\eqno (39)$$
|qright \6skip for given functions $g↓j(s↓j, \ldots , s↓n, t↓j).$
Proceed as follows:
$$\eqalign{f↑{[0]}(t↓1, t↓2, t↓3, \ldots , t↓n) ⊗= F(t↓1, t↓2, t↓3, \ldots , t↓n);\cr
f↑{[1]}(s↓n, t↓1, t↓2, \ldots , t↓{n-1}) ⊗= \sum ↓{0≤t↓n<m↓n} g↓n(s↓n, t↓n)f↑{[0]}(t↓1, t↓2, \ldots , t↓n);\cr
f↑{[2]}(s↓{n-1}, s↓n, t↓1, \ldots , t↓{n-2}) ⊗= \sum ↓{0≤t↓{n-1}<m↓{n-1}} g↓{n-1}(s↓{n-1}, s↓n, t↓{n-1})f↑{[1]}(s↓n, t↓1, \ldots , t↓{n-1});\cr
⊗\qquad \cdots \cr
f↑{[n]}(s↓1, s↓2, s↓3, \ldots , s↓n) ⊗= \sum ↓{0≤t↓1<m↓1} g↓1(s↓1, \ldots , s↓n, t↓1)f↑{[n-1]}(s↓2, s↓3, \ldots , s↓n, t↓1);\cr
f(s↓1, s↓2, s↓3, \ldots , s↓n) ⊗= f↑{[n]}(s↓1, s↓2, s↓3, \ldots , s↓n).\cr}\eqno (40)$$
%folio 624 galley 2 (C) Addison-Wesley 1978

|qleft \12skip For Yates's method as shown above, $g↓j(s↓j, \ldots
, s↓n, t↓j) = (-1)↑{s↓jt↓j};$ $f↑{[0]}(t↓1, t↓2,
t↓3)$ represents the ``Given''; $f↑{[1]}(s↓3, t↓1, t↓2)$ represents
the ``First step''; etc. Whenever a sum can be put into the
form of (39), for reasonably simple functions $g↓j(s↓j, \ldots
, s↓n, t↓j),$ the scheme (40) will reduce the amount of computation
from order $N↑2$ to roughly the order $N \log N,$ where $N =
m↓1 \ldots m↓n;$ furthermore this scheme is ideally suited
to parallel computation. The important special case of one-dimensional
Fourier transforms is discussed in exercise 14.
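|qleft \quad For concreteness, here is a short recursive Python sketch of the one-dimensional case (illustrative only, using the sign convention of (37) with $N = 2↑n$ points); each level of the recursion plays the role of one step of scheme (40):

import cmath

def fft(F):
    # f(s) = sum over t of exp(2 pi i s t / N) F(t), N a power of 2
    N = len(F)
    if N == 1:
        return list(F)
    even, odd = fft(F[0::2]), fft(F[1::2])
    out = [0] * N
    for s in range(N // 2):
        w = cmath.exp(2j * cmath.pi * s / N) * odd[s]
        out[s] = even[s] + w              # f(s)
        out[s + N // 2] = even[s] - w     # f(s + N/2)
    return out

F = [1, 2, 3, 4]
direct = [sum(F[t] * cmath.exp(2j * cmath.pi * s * t / 4) for t in range(4))
          for s in range(4)]
assert all(abs(x - y) < 1e-9 for x, y in zip(fft(F), direct))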
|qleft \quad Let us consider one more special case of polynomial
evaluation. {\sl Lagrange's interpolation polynomial} of order
$n,$ which we shall write as
$$\eqalign{u↑{[n]}(x) ⊗= y↓0 {(x - x↓1)(x - x↓2) \cdots (x - x↓n)\over (x↓0 - x↓1)(x↓0 - x↓2) \cdots (x↓0 - x↓n)}\cr
⊗\qquad + y↓1 {(x - x↓0)(x - x↓2) \cdots (x - x↓n)\over (x↓1 - x↓0)(x↓1 - x↓2) \cdots (x↓1 - x↓n)} +\cdots \cr
⊗\qquad + y↓n {(x - x↓0)(x - x↓1) \cdots (x - x↓{n-1})\over (x↓n - x↓0)(x↓n - x↓1) \cdots (x↓n - x↓{n-1})},\cr}\eqno (41)$$
is the only polynomial of degree $≤n$ in $x$ which takes
on the respective values $y↓0, y↓1, \ldots , y↓n$ at the $n
+ 1$ distinct points $x = x↓0, x↓1, \ldots ,x↓n.$ (For it is
evident from (41) that $u↑{[n]}(x↓k) = y↓k$ for $0 ≤ k ≤ n.$
If $f(x)$ is any such polynomial of degree $≤n,$ then $g(x)
= f(x) - u↑{[n]}(x)$ is of degree $≤n,$ and $g(x)$ is zero for
$x = x↓0, x↓1, \ldots , x↓n;$ therefore $g(x)$ is a multiple
of the polynomial $(x - x↓0)(x - x↓1) \cdots (x - x↓n).$ The
degree of the latter polynomial is greater than $n,$ so $g(x)
= 0.)$ If we assume that the values of a function in some table
are well-approximated by a polynomial, Lagrange's formula (41)
may therefore be used to ``interpolate'' for values of the
function at points $x$ not appearing in the table. Unfortunately,
there seem to be quite a few additions, subtractions, multiplications,
and divisions in Lagrange's formula; in fact, there are $n$
additions, $2n↑2 + 2$ subtractions, $2n↑2 + n - 1$ multiplications,
and $n + 1$ divisions. Fortunately (as we might suspect), improvement
is possible.
|qleft \quad The basic idea for simplifying (41) is to note that
$u↑{[n]}(x) - u↑{[n-1]}(x)$ is zero for $x = x↓0, \ldots , x↓{n-1};$
thus $u↑{[n]}(x) - u↑{[n-1]}(x)$ is a polynomial of degree $≤n$
and a multiple of $(x - x↓0) \cdots (x - x↓{n-1}).$ We conclude
that $u↑{[n]}(x) = α↓n(x - x↓0) \cdots (x - x↓{n-1}) + u↑{[n-1]}(x),$
where $α↓n$ is a constant. This leads us to {\sl Newton's interpolation
formula}
$$\eqalign{u↑{[n]}(x) = α↓n(x ⊗- x↓0)(x - x↓1) \cdots (x - x↓{n-1}) +\cdots \cr
⊗+ α↓2(x - x↓0)(x - x↓1) + α↓1(x - x↓0) + α↓0,\cr}\eqno (42)$$
where the $α'$s are some constants we should like to
determine from $x↓0, x↓1, \ldots , x↓n, y↓0, y↓1, \ldots , y↓n.$
Note that this formula holds for all $n; α↓k$ does not depend
on $x↓{k+1}, \ldots , x↓n, y↓{k+1}, \ldots , y↓n.$ Once the
$α'$s are known, Newton's interpolation formula is convenient
for calculation, since we may generalize Horner's rule once
again and write
$$u↑{[n]}(x) = \biglp (\cdots (α↓n(x - x↓{n-1}) + α↓{n-1})(x - x↓{n-2}) +\cdots )(x - x↓0) + α↓0\bigrp .\eqno (43)$$
This requires $n$ multiplications and $2n$ additions.
Alternatively, we may evaluate each of the individual terms of
(42) from right to left; with $2n - 1$ multiplications and $2n$
additions we thereby calculate all of the values $u↑{[0]}(x),
u↑{[1]}(x), \ldots , u↑{[n]}(x),$ and this indicates whether
or not the interpolation process is ``converging.''
|qleft \quad The $α'$s in Newton's formula may be found by computing
the ``divided differences'' in the following tableau (shown
for $n = 3):$
$$
|qctr ⊗y↓0\cr
⊗⊗(y↓1 - y↓0)/(x↓1 - x↓0) = y↑\prime ↓1\cr
⊗y↓1⊗⊗(y↑\prime ↓2 - y↑\prime ↓1)/(x↓2 - x↓0) = y↑{\prime\prime}↓2\cr
⊗⊗(y↓2 - y↓1)/(x↓2 - x↓1) = y↑\prime ↓2⊗⊗(y↑{\prime\prime}↓3 - y↑{\prime\prime}↓2)/(x↓3 - x↓0) = y↑{\prime\prime\prime}↓3\cr
⊗y↓2⊗⊗(y↑\prime ↓3 - y↑\prime ↓2)/(x↓3 - x↓1) = y↑{\prime\prime}↓3\cr
⊗⊗(y↓3 - y↓2)/(x↓3 - x↓2) = y↑\prime ↓3\cr
⊗y↓3\cr
\eqno (44)$$
|qright \12skip \5skip |newtype It is possible to prove that
$α↓0 = y↓0,$ $α↓1 = y↑\prime ↓1,$ $α↓2 = y↑{\prime\prime}↓2,$ etc., and that
the divided differences have important relations to the derivatives
of the function being interpolated; see exercise 15. Therefore
the following calculation \biglp corresponding to (44)\bigrp
may be used to obtain the $α'$s:
|qleft \quad Start with $(α↓0, α↓1, \ldots , α↓n) ← (y↓0, y↓1, \ldots
, y↓n);$ then, for $k = 1, 2, \ldots , n$
(in this order), set $α↓j ← (α↓j - α↓{j-1})/(x↓j - x↓{j-k})$
for $j = n, n - 1, \ldots , k$
(in this order).

|qleft \6skip This process requires ${1\over 2}(n↑2 + n)$
divisions and $n↑2 + n$ subtractions, so about three-fourths
of the work implied in (41) has been saved.
|qleft \quad For example, suppose that we want to estimate ${3\over 2}!$
from the values of 0!, 1!, 2!, and 3!, using a cubic polynomial.
The divided differences are
|qleft
 |qctr ⊗x⊗y⊗y↑\prime ⊗y↑{\prime\prime}⊗y↑{\prime\prime\prime}\cr
\2skip |newtype ⊗0⊗1\cr
⊗⊗⊗0\cr
⊗1⊗1⊗⊗${1\over 2}$\cr
⊗⊗⊗1⊗⊗${1\over 3}$\cr
⊗2⊗2⊗⊗${3\over 2}$\cr
⊗⊗⊗4\cr
⊗3⊗6\cr
\6skip |newtype so $u↑{[0]}(x) = u↑{[1]}(x) = 1,$ $u↑{[2]}(x)
= {1\over 2}x(x - 1) + 1,$ $u↑{[3]}(x) = {1\over 3}x(x - 1)(x
- 2) + {1\over 2}x(x - 1) + 1.$ Setting $x = {3\over 2}$ gives
$-{1\over 8} + {3\over 8} + 1 = 1.25;$ presumably the ``correct''
value is $\Gamma ({5\over 2}) = {3\over 4}\sqrt{π} \approx
1.33.$
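|qleft \quad In code, the divided-difference procedure and the generalized Horner rule (43) are only a few lines each; the following Python sketch (illustrative, with invented names and exact rational arithmetic) reproduces the factorial example just given:

from fractions import Fraction

def newton_coeffs(xs, ys):
    # divided differences computed in place, as in the procedure above
    a = [Fraction(y) for y in ys]
    n = len(xs) - 1
    for k in range(1, n + 1):
        for j in range(n, k - 1, -1):
            a[j] = (a[j] - a[j - 1]) / (xs[j] - xs[j - k])
    return a

def newton_eval(xs, a, x):
    # generalized Horner's rule (43): n multiplications, 2n additions
    result = a[-1]
    for j in range(len(a) - 2, -1, -1):
        result = result * (x - xs[j]) + a[j]
    return result

xs, ys = [0, 1, 2, 3], [1, 1, 2, 6]            # 0!, 1!, 2!, 3!
a = newton_coeffs(xs, ys)
assert a == [1, 0, Fraction(1, 2), Fraction(1, 3)]
assert newton_eval(xs, a, Fraction(3, 2)) == Fraction(5, 4)   # = 1.25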
|qleft \quad It is instructive to note that evaluation of the
interpolation polynomial is just a special case of the Chinese
remainder algorithm of Section 4.3.2, since we know the value
of $u↑{[n]}(x)$ mod the relatively prime polynomials $x - x↓0,
\ldots , x - x↓n.$ \biglp As we have seen above, $f(x)$ mod
$(x - x↓0) = f(x↓0).\bigrp$ Under this interpretation,
Newton's formula (42) is precisely the ``mixed radix representation''
of Eq. 4.3.2--24; and 4.3.2--23 yields another way to compute
$α↓0, \ldots , α↓n$ using the same number of operations as (44).
|qleft \quad By applying fast Fourier transforms, it is possible
to reduce the running time for interpolation to $O\biglp n(\log
n)↑2\bigrp ,$ and a similar reduction can also be made for related
algorithms such as the solution to the Chinese remainder problem
and the evaluation of an $n$th degree
polynomial at $n$ different points; see E. Horowitz, {\sl Inf.
Proc. Letters \bf 1} (1972), 157--163; R. Moenck and A. Borodin,
{\sl J. Comp. Syst. Sci. \bf 8} (1974), 336--385; and A. Borodin,
{\sl Complexity of Sequential and Parallel Numerical Algorithms,}
ed. by J. F. Traub (New York: Academic Press, 1973), 149--180.
However, this must be regarded as a purely theoretical possibility
at present, since the known algorithms have a rather large overhead
factor.
|qleft \quad A remarkable modification of the method of divided
differences, which applies to rational functions instead of
polynomials, was introduced by T. N. Thiele in 1909. For a discussion
of Thiele's method of ``reciprocal differences,'' see L. M.
Milne-Thomson, {\sl The Calculus of Finite Differences} (London:
Macmillan, 1933), Chapter 5; R. W. Floyd, {\sl CACM \bf 3} (1960),
508.

|qleft \12skip {\bf For further reading. \xskip} In this section
we have barely scratched the surface of a very large subject
in which many beautiful theories are emerging; a more comprehensive
treatment appears in the book {\sl Computational Complexity
of Algebraic and Numeric Problems} by A. Borodin and I. Munro
(New York: American Elsevier, 1975).

|qleft \24skip {\:r EXERCISES

|qleft }\12skip |newtype {\bf 1. \xskip} $[{\sl OC}]$ What is
a good way to evaluate an ``odd'' polynomial
$$u(x) = u↓{2n+1}x↑{2n+1} + u↓{2n-1}x↑{2n-1} +\cdots + u↓1x?$$
|qctr \3skip {\bf 2. \xskip} [M{\sl Pc}] Instead of computing
$u(x + x↓0)$ by steps H1, H2 as in the text, discuss the application
of Horner's rule (2) when {\sl polynomial} multiplication and
addition are used instead of arithmetic in the domain of coefficients.

\exno 3. [Pc] Give a method analogous to Horner's rule, for
evaluating a polynomial in two variables $\sum ↓{i+j≤n} u↓{ij}x↑iy↑j.$
(This polynomial has $(n + 1)(n + 2)/2$ coefficients, and ``total
degree'' $n.)$ Count the number of additions and multiplications
you use.

\exno 4. [MPc] The text shows that scheme (3) is superior to
Horner's rule when we are evaluating a polynomial with real
coefficients at a complex point $z.$ Compare (3) to Horner's
rule when {\sl both} the coefficients and the variable $z$ are
complex numbers; how many (real) multiplications and addition-subtractions
are required by each method?

\exno 5. [M20] Count the number of multiplications and additions
required by the second-order rule (4).
|qleft β???|newtype 58320---Comp
%folio 629 galley 3 (C) Addison-Wesley 1978
iter Programming\quad (A.-W./Knuth)\quad ch. 4\quad f. 629\quad
g. 3D
|qleft |newtype {\bf 6. \xskip} [${\sl 22}]$ (L. de Jong and
J. van Leeuwen.) Show how to improve on steps S1, \ldots , S4
by computing only about ${1\over 2}n$ powers of $x.$

\exno 7. [M24] How can $β↓0, \ldots , β↓n$ be calculated so
that (6) has the value $u(x↓0 + kh)$ for all $k?$

\exno 8. [M20] The factorial power $x↑k$ is defined to be $k!({x\atop
k}) = x(x - 1) \cdots (x - k + 1).$ Explain how to evaluate
$u↓nx↑n +\cdot \cdot \cdot + u↓1x↑1 + u↓0$ with at most $n$
multiplications and $2n - 1$ additions, starting with $x$ and
the $n + 3$ constants $u↓n, \ldots , u↓0, 1, n - 1.$

\exno 9. [M24] (H. J. Ryser.) Show that if $X = (x↓{ij})$ is
an $n \times n$ matrix, then
$$\hbox{per}(X) = \sum (-1)↑{n-ε↓1-\cdots -ε↓n}\prod ↓{1≤i≤n} \sum ↓{1≤j≤n} ε↓jx↓{ij}$$
summed over all $2↑n$ choices of $ε↓1, \ldots
, ε↓n$ equal to 0 or 1 independently. Count the number of addition
and multiplication operations required to evaluate per$(X)$
by this formula.
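|qleft \qquad For readers who wish to experiment, Ryser's formula can be transcribed directly (an illustrative Python sketch, not required by the exercise):

from itertools import product

def permanent(X):
    # Ryser's formula: one term for each 0-1 vector (e_1, ..., e_n)
    n = len(X)
    total = 0
    for eps in product((0, 1), repeat=n):
        term = 1
        for i in range(n):
            term *= sum(x for x, e in zip(X[i], eps) if e)
        total += (-1) ** (n - sum(eps)) * term
    return total

assert permanent([[1, 2], [3, 4]]) == 1*4 + 2*3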

\exno 10. [M21] The permanent of an $n \times n$ matrix $X =
(x↓{ij})$ may be calculated as follows: Start with the $n$ quantities
$x↓{11}, x↓{12}, \ldots , x↓{1n}.$ For $1 ≤ k < n,$ assume that
the $({n\atop k})$ quantities $A↓{kS}$ have been computed, for
all $k$-element subsets $S$ of $\{1, 2, \ldots , n\},$ where
$A↓{kS} = \sum x↓{1j↓1} \cdots x↓{kj↓k}$ summed
over all $k!$ permutations $j↓1 \ldots j↓k$ of the elements
of $S;$ then form all of the sums
$$A↓{(k+1)S} = \sum ↓{j\in S} A↓{k(S-\{j\})}x↓{(k+1)j}.$$
We have per$(X) = A↓{n\{1, \ldots , n\}}.$
|qleft \qquad How many additions and multiplications does this
method require? How much temporary storage is needed?

\exno 11. [M50] Is there any way to evaluate the permanent of
a general $n \times n$ matrix using a number of operations which
does not grow exponentially with $n?$

\exno 12. [M50] What is the minimum number of multiplications
required to multiply two $n \times n$ matrices? Can the total
number of operations be reduced to less than $O(n↑{\lg 7})?$

\exno 13. [M23] Find the inverse of the general finite Fourier
transform (37), by expressing $F(t↓1, \ldots , t↓n)$ in terms
of the values of $f(s↓1, \ldots , s↓n).$ [{\sl Hint:} See Eq.
1.2.9--13.]

NOTE: CROSS-REFERENCE TO THIS EX WAS MADE IN ANSWER TO EX 4.6--5, IS IT OK?
\exno 14. [HM28] (a) ({\sl ``Fast Fourier transforms.''}) Show
that the scheme (40) can be used to evaluate the one-dimensional
Fourier transform
$$f(s) = \sum ↓{0≤t<2↑n} F(t)\omega ↑{st},\qquad \omega = e↑{2πi/2↑n},\qquad
0 ≤ s < 2↑n,$$
using arithmetic on complex numbers. (b) Apply
this result to prove that we can obtain the coefficients of
the product of two given polynomials, $u(z)v(z),$ in $O(n \log
n)$ operations of (exact) addition and multiplication of complex
numbers, if $u(z)$ and $v(z)$ are $n$th degree polynomials with
complex coefficients. [{\sl Hint:} Consider the product
of Fourier transforms of the coefficients.]

\exno 15. [HM28] The $n$th {\sl divided difference} $f(x↓0, x↓1,
\ldots , x↓n)$ of a function $f(x)$ at $n + 1$ distinct points
$x↓0, x↓1, \ldots , x↓n$ is defined by the formula $f(x↓0, x↓1,
\ldots , x↓n) = \biglp f(x↓0, x↓1, \ldots , x↓{n-1})
- f(x↓1, \ldots , x↓{n-1}, x↓n)\bigrp /(x↓0 - x↓n),$
for $n > 0.$ Thus $f(x↓0, x↓1, \ldots , x↓n) = \sum ↓{0≤k≤n}
f(x↓k)/\prod ↓{0≤j≤n, j≠k}(x↓k - x↓j)$ is a symmetric function of
its $n + 1$ arguments. (a) Prove that $f(x↓0, \ldots , x↓n)
= f↑{(n)}(\theta )/n!,$ for some $\theta$ between $\min(x↓0,
\ldots , x↓n)$ and $\max(x↓0, \ldots , x↓n),$ if the $n$th derivative
$f↑{(n)}(x)$ exists and is continuous. [{\sl Hint:} Prove
that
$$f(x↓0, x↓1, \ldots , x↓n) = \int ↑{1}↓{0}dt↓1\int ↑{t↓1}↓{0}dt↓2
\cdots \int ↑{t↓{n-1}}↓{0}dt↓n\,f↑{(n)}\biglp x↓0(1 - t↓1) + x↓1(t↓1
- t↓2) +\cdots + x↓{n-1}(t↓{n-1} - t↓n) + x↓n(t↓n
- 0)\bigrp .$$
(This formula also defines $f(x↓0, x↓1, \ldots
, x↓n)$ in a useful manner when the $x↓j$ are not distinct.)]
(b) If $y↓j = f(x↓j),$ show that $α↓j = f(x↓0, \ldots , x↓j)$ in
Newton's interpolation polynomial (42).

\exno 16. [M22] How can we readily compute the coefficients
of $u↑{[n]}(x) = u↓nx↑n +\cdot \cdot \cdot + u↓0,$ if we are
given the values of $x↓0, x↓1, \ldots , x↓{n-1}, α↓0, α↓1, \ldots
, α↓n$ in Newton's interpolation polynomial (42)?

\exno 17. [M46] Is there a way to evaluate the polynomial
$$\sum ↓{1≤i<j≤n} x↓ix↓j = x↓1x↓2 +\cdots + x↓{n-1}x↓n$$
with fewer than $n - 1$ multiplications and $2n - 4$ additions? (There are $({n\atop 2})$ terms.)

\exno 18. [M20] If the fourth-degree scheme (9) were changed
to
$$y = (x + α↓0)x + α↓1,\qquad u(x) = \biglp (y - x + α↓2)y + α↓3\bigrp α↓4,$$
|qctr \6skip what formulas for computing the $α↓j'$s in terms
of the $u↓k'$s would take the place of (10)?

\exno 19. [M24] Explain how to determine the adapted coefficients
$α↓0, α↓1, \ldots , α↓5$ in (11) from the coefficients $u↓5,
\ldots , u↓1, u↓0$ of $u(x)$ and find the $α'$s for the particular
polynomial $u(x) = x↑5 + 5x↑4 - 10x↑3 - 50x↑2 + 13x + 60.$

\exno 20. [21] Write a MIX program which evaluates a fifth-degree
polynomial according to scheme (11); try to make the program
as efficient as possible, by making slight modifications to
(11). Use MIX's floating-point arithmetic operators.

\exno 21. [20] Find two more ways to evaluate the polynomial
$x↑6 + 13x↑5 + 49x↑4 + 33x↑3 - 61x↑2 - 37x + 3$ by scheme (12),
using the two roots of (15) which were not considered in the
text.

\exno 22. [18] What is the scheme for evaluating $x↑6 - 3x↑5
+ x↑4 - 2x↑3 + x↑2 - 3x - 1,$ using Pan's method (16)?

|qleft \6skip {\bf 23. \xskip} [HM{\sl 30}] (J. Eve.) Let
$f(z) = a↓nz↑n + a↓{n-1}z↑{n-1} +\cdots + a↓0$ be
a polynomial of degree $n$ with real coefficients, having at
least $n - 1$ roots with a nonnegative real part. Let
$$\eqalign{g(z) ⊗= a↓nz↑n + a↓{n-2}z↑{n-2} +\cdots + a↓{n mod 2}z↑{n mod 2},\cr
h(z) ⊗= a↓{n-1}z↑{n-1} + a↓{n-3}z↑{n-3} +\cdots + a↓{(n-1) mod 2}z↑{(n-1) mod 2}.\cr}$$
Assume that $h(z)$ is not identically zero.
%folio 632 galley 4 (C) Addison-Wesley 1978

|qleft \12skip |newtype \qquad a) Show that $g(z)$ has at least
$n - 2$ imaginary roots (i.e., roots whose real part is zero),
and $h(z)$ has at least $n - 3$ imaginary roots. [{\sl Hint:}
Consider the number of times the path $f(z)$ circles the
origin as $z$ goes around the path shown in Fig. 15, for a sufficiently
large radius $R$.]
|qleft \qquad b) Prove that the squares of the roots of $g(z)
= 0$ and $h(z) = 0$ are all real.

\exno 24. [M????] Find values of $c$ and $α↓k, β↓k$ satisfying
the conditions of Theorem E when $u(x) = (x + 7)(x↑2 + 6x + 10)(x↑2
+ 4x + 5)(x + 1).$ Choose these values so that $β↓2 = 0.$ Give
two different solutions to this problem!

\exno 25. [M20] When the construction in the proof of Theorem
M is applied to the (inefficient) polynomial chain
$$\eqalign{λ↓1 ⊗= α↓1 + λ↓0,\qquad λ↓2 = -λ↓0 - λ↓0,\qquad λ↓3 = λ↓1 + λ↓1,\qquad λ↓4 = α↓2 \times λ↓3,\cr
λ↓5 ⊗= λ↓0 - λ↓0,\qquad λ↓6 = α↓6 - λ↓5,\qquad λ↓7 = α↓7 + λ↓6,\qquad λ↓8 = λ↓7 \times λ↓7,\cr}$$
how can $β↓1, β↓2, \ldots , β↓9$ be expressed in terms
of $α↓1, \ldots , α↓8?$

\exno 26. [M21] (a) Give the polynomial chain corresponding
to Horner's rule for evaluating polynomials of degree $n = 3.$
(b) Using the construction in the proof of Theorem A, express
$\kappa ↓1, \kappa ↓2, \kappa ↓3,$ and the result polynomial
$u(x)$ in terms of $β↓1, β↓2, β↓3, β↓4,$ and $x.$ (c) Show that
the result set obtained in (b), as $β↓1, β↓2, β↓3, β↓4$ independently
assume all real values, omits certain vectors in the result
set of (a).

\exno 27. [M22] Let $R$ be a set which includes all $(n + 1)-$tuples
$(q↓n, \ldots , q↓1, q↓0)$ of real numbers such that $q↓n ≠
0;$ prove that $R$ does not have at most $n$ degrees of freedom.

\exno 28. [HM20] Show that if $f↓0(α↓1, \ldots , α↓s), \ldots
, f↓s(α↓1, \ldots , α↓s)$ are multivariate polynomials with
integer coefficients, then there is a nonzero polynomial $g(x↓0,
\ldots , x↓s)$ with integer coefficients such that $g\biglp
f↓0(α↓1, \ldots , α↓s), \ldots , f↓s(α↓1,
\ldots , α↓s)\bigrp = 0$ for all real $α↓1, \ldots
, α↓s.$ (Hence any polynomial chain with $s$ parameters has
at most $s$ degrees of freedom.) [{\sl Hint:} Use the
theorems about ``algebraic dependence'' which are found, for
example, in B. L. van der Waerden's {\sl Modern Algebra,} tr.
by Fred Blum (New York: Ungar, 1949), Section 64.]

\exno 29. [M20] Let $R↓1, R↓2, \ldots , R↓m$ all be sets of
$(n + 1)-$tuples of real numbers having at most $t$ degrees
of freedom. Show that the union $R↓1 ∪ R↓2 ∪ \cdots ∪ R↓m$ also
has at most $t$ degrees of freedom.
|qleft |newtype \3skip {\bf 30. \xskip} [M{\sl 28}] Prove
that a polynomial chain with $m↓1$ chain multiplications and
$m↓2$ parameter multiplications has at most $2m↓1 + m↓2 + \delta
↓{0m↓1}$ degrees of freedom. [{\sl Hint:} Generalize
Theorem M, showing that the first chain multiplication and each
parameter multiplication can essentially introduce only one
new parameter into the result set.]

\exno 31. [M23] Prove that a polynomial chain capable of computing
all {\sl monic} polynomials of degree $n$ has at least
$\lfloor n/2\rfloor$ multiplications and at least $n$ addition-subtractions.

\exno 32. [M24] Find a polynomial chain of minimum possible
length which can compute all polynomials of the form $u↓4x↑4
+ u↓2x↑2 + u↓0;$ prove that its length is minimal.

\exno 33. [M25] Let $n ≥ 3$ be odd. Prove that a polynomial
chain with $\lfloor n/2\rfloor + 1$ multiplication steps cannot
compute all polynomials of degree $n$ unless it has at least
$n + 2$ addition-subtraction steps. {\sl [Hint{\sl : }}See exercise
30.]

\exno 34. [M26] Let $λ↓0, λ↓1, \ldots , λ↓r$ be a polynomial
chain in which all addition and subtraction steps are parameter
steps, and which contains at least one parameter multiplication.
Assume that this scheme has $m$ multiplications and $k = r -
m$ addition-subtractions, and that the polynomial computed by
the chain has maximum degree $n.$ Prove that all polynomials
computable by this chain, for which the coefficient of $x↑n$
is not zero, can be computed by another chain which has at most
$m$ multiplications and at most $k$ additions, and no subtractions;
and whose last step is the only parameter multiplication.

\exno 35. [M25] Show that any polynomial chain which computes
a general fourth degree polynomial using only three multiplications
must have at least five addition-subtractions. {\sl [Hint{\sl
: }}Assume that there are only four addition-subtractions, and
show that exercise 34 applies; this means the scheme must have
a particular form which is incapable of representing all fourth
degree polynomials.]

\exno 36. [M27] Show that any polynomial chain which computes
a general sixth-degree polynomial using only four multiplications
must have at least seven addition-subtractions. (Cf. exercise
35.)

\exno 37. [M21] (T. S. Motzkin.) Show that ``almost all'' rational
functions of the form
$$(u↓nx↑n + u↓{n-1}x↑{n-1} +\cdots + u↓1x + u↓0)/(x↑n
+ v↓{n-1}x↑{n-1} +\cdots + v↓1x + v↓0),$$
with coefficients in a field $S,$ can be evaluated
using the scheme
$$α↓1 + β↓1/\biglp x + α↓2 + β↓2/(x +\cdots + β↓n/(x
+ α↓{n+1}) \cdots )\bigrp ,$$
for suitable $α↓j, β↓j$ in $S.$ (This continued
fraction scheme has $n$ divisions and $2n$ additions; by ``almost
all'' rational functions we mean all except those whose coefficients
satisfy some nontrivial polynomial equation.) Determine the
$α'$s and $β'$s for the rational function $(x↑2 + 10x + 29)/(x↑2
+ 8x + 19).$

\exno 38. [HM32] (A. M. Garsia, 1962.) The purpose of this exercise
is to prove that Horner's rule is really optimal if no preliminary
adaptation of coefficients is made; we need $n$ multiplications
and $n$ additions to compute $u↓nx↑n +\cdot \cdot \cdot + u↓1x
+ u↓0,$ if the variables $u↓n, \ldots , u↓1, u↓0, x,$ and arbitrary
constants are given. Consider chains which are as before except
that $u↓n, \ldots , u↓1, u↓0, x$ are each considered to be variables;
we may say, for example, that $λ↓{-j-1} = u↓j, λ↓0 = x.$ In
order to show that Horner's rule is best, it is convenient to
prove a somewhat more general theorem: Let $A = (a↓{ij}), 0
≤ i ≤ m, 0 ≤ j ≤ n,$ be an $(m + 1) \times (n + 1)$ matrix of
real numbers, of rank $n + 1;$ and let $B = (b↓0, \ldots , b↓m)$
be a vector of real numbers. Prove that {\sl any polynomial
chain which computes}
$$P(x; u↓0, \ldots , u↓n) = \sum ↓{0≤i≤m} (a↓{i0}u↓0 +\cdots + a↓{in}u↓n + b↓i)x↑i$$
involves at least $n$ chain multiplications. (Note
that this does not mean only that we are considering some fixed
chain in which the parameters $α↓j$ are assigned values depending
on $A$ and $B;$ it means that both the chain {\sl and} the values
of the $α'$s may depend on the given matrix $A$ and vector $B.$
No matter how $A, B,$ and the values of $α↓j$ are chosen, it
is impossible to compute $P(x; u↓0, \ldots , u↓n)$ without doing
$n$ ``chain step'' multiplications.) The assumption that
$A$ has rank $n + 1$ implies that $m ≥ n.$ [{\sl Hint:} Show
that from any such scheme we can derive another which has one
less chain multiplication and which has $n$ decreased by one.]
%folio 635 galley 5 (C) Addison-Wesley 1978

|qleft \12skip |newtype {\bf 39. \xskip} [M{\sl 29}] (T. S.
Motzkin, 1954.) Show that schemes of the form $w↓1 = x(x + α↓1)
+ β↓1,$ $w↓k = w↓{k-1}(w↓1 + \gamma ↓kx + α↓k) + \delta ↓kx +
β↓k$ for $1 < k ≤ m,$ where the $α↓k, β↓k$ are real and the
$\gamma ↓k, \delta ↓k$ are integers, can be used to evaluate all
monic polynomials of degree $2m$ over the real numbers. (We
may have to choose the $\gamma ↓k$ and $\delta ↓k$ differently
for different polynomials.) Try to let $\delta ↓k = 0$ whenever
possible.
|qleft \3skip {\bf 40. \xskip} [M{\sl 41}] Can the lower bound on the number of multiplications in
Theorem C be raised from $\lfloor n/2\rfloor + 1$ to $\lceil
n/2\rceil + 1?$ (Cf. exercise 33. An unfounded remark that Belaga
[{\sl Doklady Akad. Nauk SSSR \bf 123} (1958), 775--777] gave
such an improvement has appeared several times in the literature.)

\exno 41. [22] (Oscar Buneman.) Show that the real and imaginary
parts of $(a + bi)(c + di)$ can be obtained by doing 3 multiplications
and 5 additions of real numbers.
|qleft {\bf 42. \xskip} [{\sl 36}] (M. Paterson and L. Stockmeyer.)
(a) Prove that a polynomial chain with $m ≥ 2$ chain multiplications
has at most $m↑2 + 1$ degrees of freedom. (b) Show that for
all $n ≥ 2$ there exist polynomials of degree $n,$ all of whose
coefficients are 0 or 1, which cannot be evaluated by any polynomial
chain with less than $\lfloor \sqrt{n}\rfloor$ multiplications,
if we require all parameters $α↓j$ to be integers. (c) Show
that any polynomial of degree $n$ with integer coefficients
can be evaluated by an all-integer algorithm which performs
at most $2\lfloor \sqrt{n}\rfloor$ multiplications, if we don't
care how many additions we do.

\exno 43. [22] Explain how to evaluate $x↑n +\cdot \cdot \cdot
+ x + 1$ with $2l(n + 1) - 2$ multiplications and $l(n + 1)$
additions (no divisions or subtractions).
|qleft |newtype \24skip {\:r 4.7. \xskip MANIPULATION OF POWER
SERIES

|qleft }\12skip If we are given two power series
$$$U(z) = U↓0 + U↓1z + U↓2z↑2 +\cdots ,\qquad V(z) = V↓0 + V↓1z
+ V↓2z↑2 +\cdots ,\eqno (1)$
|qctr \6skip whose coefficients belong to a field, we can form
their sum, their product, their quotient, etc., to obtain new
power series. A polynomial is obviously a special case of a
power series, in which there are only finitely many terms.
|qleft \quad Of course, only a finite number of terms can be
represented and stored within a computer, so it makes sense
to ask whether power series arithmetic is even possible on computers;
and if it is possible, what makes it different from polynomial
arithmetic? The answer is that we work only with the first $N$
coefficients of the power series, where $N$ is a parameter which
may in principle be arbitrarily large; instead of ordinary polynomial
arithmetic, we are essentially doing polynomial arithmetic modulo
$z↑N,$ and this often leads to a somewhat different point of
view. Furthermore, special operations, like ``reversion,'' can
be performed on power series but not on polynomials, since polynomials
are not closed under these operations.
|qleft \quad Manipulation of power series has several applications
to numerical analysis, but perhaps its greatest use is the determination
of asymptotic expansions (as we have seen in Section 1.2.11.3),
or the calculation of quantities defined by certain generating
functions. The latter applications make it desirable to calculate
the coefficients exactly, instead of with floating-point arithmetic.
All of the algorithms in this section, with obvious exceptions,
can be done using rational operations only, so the techniques
of Section 4.5.1 can be used to obtain exact results when desired.
|qleft \quad The calculation of $W(z) = U(z) \pm V(z)$ is, of
course, trivial, since we have $W↓n = U↓n \pm V↓n$ for $n =
0, 1, 2, \ldots .$ It is also easy to calculate $W(z) = U(z)V(z),$
using the familiar ``Cauchy product rule'':
$$W↓n = \sum ↓{0≤k≤n} U↓kV↓{n-k} = U↓0V↓n + U↓1V↓{n-1} +\cdots + U↓nV↓0.\eqno (2)$$
|qctr \6skip \quad The quotient $W(z) = U(z)/V(z),$ when $V↓0
≠ 0,$ can be obtained by interchanging $U$ and $W$ in (2); we
obtain the rule
$$\eqalign{W↓n ⊗= \biglp U↓n - \sum ↓{0≤k<n} W↓kV↓{n-k}\bigrp \left/V↓0\right.\cr
⊗= (U↓n - W↓0V↓n - W↓1V↓{n-1} -\cdots - W↓{n-1}V↓1)/V↓0.\cr}\eqno (3)$$
This recurrence relation for the $W'$s makes it easy
to determine $W↓0, W↓1, W↓2, \ldots$ successively, without inputting
$U↓n$ and $V↓n$ until after $W↓{n-1}$ has been computed. Let
us say that a power-series manipulation algorithm with the latter
property is ``on-line''; an on-line algorithm can be used to
determine $N$ coefficients $W↓0, W↓1, \ldots , W↓{N-1}$ of the
result without knowing $N$ in advance, so it is possible in
theory to run the algorithm indefinitely and compute the entire
power series; or to run it until a certain condition is met.
(The opposite of ``on-line'' is ``off-line.'')
|qleft \quad If the coefficients $U↓k$ and $V↓k$ are integers
but the $W↓k$ are not, recurrence (3) involves computation with
fractions. This can be avoided by the all-integer approach described
in exercise 2.
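|qleft \quad In code, rule (3) is a three-line loop; the following minimal Python sketch (illustrative, with invented names, using exact rational arithmetic so that the fractions just mentioned cause no rounding) computes the first $N$ quotient coefficients on-line:

from fractions import Fraction

def series_divide(U, V, N):
    # W[n] = (U[n] - W[0]V[n] - ... - W[n-1]V[1]) / V[0], per (3)
    coef = lambda A, k: Fraction(A[k]) if k < len(A) else Fraction(0)
    W = []
    for n in range(N):
        acc = coef(U, n) - sum(W[k] * coef(V, n - k) for k in range(n))
        W.append(acc / coef(V, 0))
    return W

# 1/(1 - z) = 1 + z + z^2 + z^3 + ...
assert series_divide([1], [1, -1], 5) == [1, 1, 1, 1, 1]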
|qleft \quad Let us now consider the operation of computing
$W(z) = V(z)↑α,$ where $α$ is an ``arbitrary'' power. For
example, we could calculate the square root of $V(z)$ by taking
$α = {1\over 2},$ or we could find $V(z)↑{-10}$ or even $V(z)↑π.$
If $V↓m$ is the first nonzero coefficient of $V(z),$ we have
$$\eqalign{V(z) ⊗= V↓mz↑m\biglp 1 + (V↓{m+1}/V↓m)z + (V↓{m+2}/V↓m)z↑2 +\cdots \bigrp ,\cr
V(z)↑α ⊗= V↑{α}↓{m}z↑{αm}\biglp 1 + (V↓{m+1}/V↓m)z + (V↓{m+2}/V↓m)z↑2 +\cdots \bigrp ↑α.\cr}\eqno (4)$$
This will be a power series if and only if $αm$ is a
nonnegative integer. From (4) we can see that the problem of
computing general powers can be reduced to the case that $V↓0
= 1;$ then the problem is to find the coefficients of
$$W(z) = (1 + V↓1z + V↓2z↑2 + V↓3z↑3 +\cdots )↑α.\eqno (5)$$
Clearly $W↓0 = 1↑α = 1.$
|qleft \quad The obvious way to find the coefficients of (5)
is to use the binomial theorem (Eq. 1.2.9--19), or (if $α$ is
a positive integer) to try repeated squaring as in Section 4.6.3,
but a much simpler and more efficient device for calculating
powers has been suggested by J. C. P. Miller. [See P. Henrici,
{\sl JACM \bf 3} (1956), 10--15.] If $W(z) = V(z)↑α,$ we have
by differentiation
$$W↓1 + 2W↓2z + 3W↓3z↑2 +\cdots = W↑\prime (z) = αV(z)↑{α-1}V↑\prime (z);\eqno (6)$$
therefore
$$W↑\prime (z)V(z) = αW(z)V↑\prime (z).\eqno (7)$$
If we now equate the coefficients of $z↑{n-1}$
in (7), we find that
$$\sum ↓{0≤k≤n} kW↓kV↓{n-k} = α\sum ↓{0≤k≤n} (n - k)W↓kV↓{n-k},\eqno (8)$$
and this gives us a useful computational rule,
$$\eqalign{W↓n ⊗= \sum ↓{1≤k≤n} \left(\left({α + 1\over n}\right)k - 1\right)V↓kW↓{n-k}\cr
⊗= \biglp (α + 1 - n)V↓1W↓{n-1} + (2(α + 1) - n)V↓2W↓{n-2} +\cdots + nαV↓n\bigrp /n,\qquad n ≥ 1.\cr}\eqno (9)$$
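|qleft \quad Rule (9) is equally easy to program; this minimal Python sketch (illustrative, with invented names) computes $W(z) = V(z)↑α$ when $V↓0 = 1,$ using exact rational arithmetic:

from fractions import Fraction

def series_power(V, alpha, N):
    # W[n] = sum over 1 <= k <= n of (((alpha+1)/n)k - 1)V[k]W[n-k], per (9)
    alpha = Fraction(alpha)
    coef = lambda k: Fraction(V[k]) if k < len(V) else Fraction(0)
    W = [Fraction(1)]                       # W_0 = 1^alpha = 1
    for n in range(1, N):
        W.append(sum(((alpha + 1) * k / n - 1) * coef(k) * W[n - k]
                     for k in range(1, n + 1)))
    return W

# (1 + z)^(1/2) = 1 + z/2 - z^2/8 + ...
assert series_power([1, 1], Fraction(1, 2), 3) == \
       [1, Fraction(1, 2), Fraction(-1, 8)]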
%folio 638 galley 6 (C) Addison-Wesley 1978

|qleft \12skip \quad The transformation of power series which
is perhaps of greatest interest is called ``reversion of series.''
This problem is to solve the equation
$$$z = t + V↓2t↑2 + V↓3t↑3 + V↓4t↑4 +\cdot \cdot \cdot \eqno
(10)$
|qctr \6skip for $t,$ obtaining the coefficients of the power
series
$$$t = z + W↓2z↑2 + W↓3z↑3 + W↓4z↑4 +\cdots .\eqno (11)$
|qctr \6skip Several interesting ways to achieve such a reversion
are known; we might say that the ``classical'' method is one
based on Lagrange's remarkable inversion formula [{\sl M\'emoires
Acad. Royale des Sciences et Belles-Lettres de Berlin \bf 24}
(1768), 251--326], which states that
$$W↓n = U↓{n-1}/n,\qquad \hbox{if}\qquad U↓0 + U↓1t + U↓2t↑2 +\cdots = (1 + V↓2t + V↓3t↑2 +\cdots )↑{-n}.\eqno (12)$$
|qctr \6skip For example, we have $(1 - t)↑{-5} = ({4\atop 4})
+ ({5\atop 4})t + ({6\atop 4})t↑2 +\cdots ;$ hence $W↓5$ in
the reversion of $z = t - t↑2$ is equal to $({8\atop 4})/5
= 14.$ This checks with the formulas for enumerating binary trees
in Section 2.3.4.4.
|qleft \quad Relation (12) shows that we can revert the series
(10) if we compute $(1 + V↓2t + V↓3t↑2 +\cdots )↑{-n}$
for $n = 1, 2, 3, \ldots .$ A straightforward application of
this idea would lead to an off-line power-series algorithm which
uses approximately $N↑3/2$ multiplications to find $N$ coefficients,
but Eq. (9) makes it possible to work with only the first $n$
coefficients of $(1 + V↓2t + V↓3t↑2 +\cdots )↑{-n},$
obtaining an on-line algorithm which has only about $N↑3/6$
multiplications.

|qleft \12skip {\bf Algorithm L} ({\sl Lagrangian power-series
reversion}). This on-line algorithm inputs the value of $V↓n$
in (10) and outputs the value of $W↓n$ in (11), for $n = 2,
3, 4, \ldots , N.$ (The number $N$ need not be specified in
advance; some other termination condition may be substituted.)

|qleft \6skip |indent {\bf L1.} [Initialize.] Set $n ← 1,$ $U↓0
← 1.$ \biglp During the course of this algorithm, we will have
$$\qquad (1 + V↓2t + V↓3t↑2 +\cdots )↑{-n} = U↓0 + U↓1t +\cdots + U↓{n-1}t↑{n-1} + O(t↑n).\bigrp \eqno (13)$$
{\bf L2.} [Input $V↓n.$] Increase $n$ by 1. If
$n > N,$ the algorithm terminates; otherwise input the next
coefficient, $V↓n.$

\algstep L3. [Divide.] Set $U↓k ← U↓k - U↓{k-1}V↓2 -\cdots - U↓1V↓k - V↓{k+1},$ for $k = 1, 2, \ldots , n - 2$ (in
this order); then set
$$\qquad U↓{n-1} ← -2U↓{n-2}V↓2 - 3U↓{n-3}V↓3 -\cdots - (n - 1)U↓1V↓{n-1} - nV↓n.$$
\qquad \biglp We have thereby divided $U(z)$ by
$V(z)/z;$ cf. (3) and (9).\bigrp
|qleft \3skip {\bf L4.} [Output $W↓n.$] Output $U↓{n-1}/n$
(which is $W↓n)$ and return to L2.
|qleft |cancelindent |newtype {\bf Fig. 16. \xskip} Power-series
reversion by Algorithm L.
|qctr \6skip |newtype When applied to the example $z = t - t↑2,$
the computation in Algorithm L is
$$⊗n⊗V↓n⊗U↓0⊗U↓1⊗U↓2⊗U↓3⊗U↓4⊗W↓n\cr
\2skip 1⊗1⊗1⊗⊗⊗⊗⊗1\cr
2⊗-1⊗1⊗2⊗⊗⊗⊗1\cr
3⊗0⊗1⊗3⊗6⊗⊗⊗2\cr
4⊗0⊗1⊗4⊗10⊗20⊗⊗5\cr
5⊗0⊗1⊗5⊗15⊗35⊗70⊗14\cr$$
\6skip Exercise 8 shows that a slight modification of Algorithm
L will solve a considerably more general problem with only a
little more effort.
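|qleft \quad A direct Python rendering of Algorithm L (illustrative only; $V$ is given as a list with $V[k]$ the coefficient of $t↑k,$ and exact rational arithmetic is used) reproduces the table above:

from fractions import Fraction

def algorithm_L(V, N):
    # reversion of z = t + V[2]t^2 + ..., following steps L1-L4
    coef = lambda k: Fraction(V[k]) if k < len(V) else Fraction(0)
    U = [Fraction(1)]                       # L1: U_0 <- 1
    W = [Fraction(1)]                       # W_1 = 1
    for n in range(2, N + 1):               # L2: input V_n
        for k in range(1, n - 1):           # L3: divide U(z) by V(z)/z
            U[k] -= sum(U[k + 1 - j] * coef(j)
                        for j in range(2, k + 1)) + coef(k + 1)
        U.append(-sum(j * U[n - j] * coef(j) for j in range(2, n + 1)))
        W.append(U[n - 1] / n)              # L4: output W_n
    return W

# z = t - t^2 gives the Catalan numbers 1, 1, 2, 5, 14, ...
assert algorithm_L([0, 1, -1], 5) == [1, 1, 2, 5, 14]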
|qleft \quad Let us now consider solving the equation
$$$U↓1z + U↓2z↑2 + U↓3z↑3 +\cdot \cdot \cdot = t + V↓2t↑2 +
V↓3t↑3 +\cdot \cdot \cdot \eqno (14)$
|qctr \6skip for $t,$ obtaining the coefficients of the power
series
$$$t = W↓1z + W↓2z↑2 + W↓3z↑3 + W↓4z↑4 +\cdots .\eqno (15)$
|qctr \6skip Eq. (10) is the special case $U↓1 = 1, U↓2 = U↓3
=\cdot \cdot \cdot = 0.$ If $U↓1 ≠ 0,$ we may assume that $U↓1
= 1,$ if we replace $z$ by $(U↓1z);$ but we shall consider the
general equation (14), since $U↓1$ might equal zero.

|qleft \12skip {\bf Algorithm T} ({\sl General power-series reversion}).
This on-line algorithm inputs the values of $U↓n$ and $V↓n$
in (14) and outputs the value of $W↓n$ in (15), for $n = 1,
2, 3, \ldots , N.$ An auxiliary matrix $T↓{mn}, 1 ≤ m ≤ n ≤
N,$ is used in the calculations.

|qleft \6skip |indent {\bf T1.} [Initialize.] Set $n ← 1.$ Let
the first two inputs (namely, $U↓1$ and $V↓1)$ be stored in
$T↓{11}$ and $V↓1,$ respectively. (We must have $V↓1 = 1.)$

\algstep T2. [Output $W↓n.]$ Output the value of $T↓{1n} ($which
is $W↓n).$
|qleft {\bf T3.} [Input $U↓n, V↓n.]$ Increase $n$ by 1. If $n
> N,$ the algorithm terminates; otherwise store the next two
inputs (namely $U↓n$ and $V↓n)$ in $T↓{1n}$ and $V↓n,$ respectively.

\algstep T4. [Multiply.] Set
$$\qquad T↓{mn} ← T↓{11}T↓{m-1,n-1} + T↓{12}T↓{m-1,n-2} +\cdots + T↓{1,n-m+1}T↓{m-1,m-1}$$
\qquad and $T↓{1n} ← T↓{1n} - V↓mT↓{mn},$
for $2 ≤ m ≤ n.$ (After this step we have the conditions
$$t↑m = T↓{mm}z↑m + T↓{m,m+1}z↑{m+1} +\cdots + T↓{mn}z↑n + O(z↑{n+1}),\qquad 1 ≤ m ≤ n.\eqno (16)$$
\qquad It is easy to verify (16) by induction
for $m ≥ 2,$ and when $m = 1,$ we have $U↓n = T↓{1n} + V↓2T↓{2n}
+\cdots + V↓nT↓{nn}$ by (14) and (16).) Return to
step T2.
|qleft |cancelindent \12skip \quad Equation (16) explains the
mechanism of this algorithm, which is due to Henry C. Thacher,
Jr. [{\sl CACM \bf 9} (1966), 10--11]. The running time is essentially
the same as Algorithm L, but considerably more storage space
is required. An example of this
%folio 643 galley 7 (C) Addison-Wesley 1978
algorithm is worked in exercise 9.
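|qleft \quad A Python rendering of Algorithm T (illustrative only, again with exact rational arithmetic; only the entries $T↓{mn}$ with $m ≤ n$ are ever used) may help clarify step T4:

from fractions import Fraction

def algorithm_T(U, V, N):
    # reversion of U(z) = V(t) as in (14), with V[1] = 1
    cU = lambda k: Fraction(U[k]) if k < len(U) else Fraction(0)
    cV = lambda k: Fraction(V[k]) if k < len(V) else Fraction(0)
    T = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    T[1][1] = cU(1)                         # T1
    W = [T[1][1]]                           # T2: W_1
    for n in range(2, N + 1):               # T3: input U_n, V_n
        T[1][n] = cU(n)
        for m in range(2, n + 1):           # T4: maintain invariant (16)
            T[m][n] = sum(T[1][j] * T[m - 1][n - j]
                          for j in range(1, n - m + 2))
            T[1][n] -= cV(m) * T[m][n]
        W.append(T[1][n])
    return W

# the same example: z = t - t^2
assert algorithm_T([0, 1], [0, 1, -1], 5) == [1, 1, 2, 5, 14]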

|qleft \12skip \quad Still another approach to power series
reversion has been proposed by R. P. Brent and H. T. Kung [{\sl
Analytic Computational Complexity,} ed. by J. F. Traub (New York:
Academic Press, 1975), 000--000], who observed that standard
iterative procedures used to find roots of equations over the
real numbers can usefully be applied also to equations over
power series. In particular, consider Newton's method for computing
approximations to a real number $t$ such that $f(t) = 0,$ given
a function $f$ that is well-behaved near $t:$ If $x$ is a good
approximation to $t,$ then $\varphi (x) = x - f(x)/f↑\prime (x)$
will be even better, for if we write $x = t + ε$ we have
$f(x) = f(t) + εf↑\prime (t) + O(ε↑2),$ $f↑\prime (x)
= f↑\prime (t) + O(ε),$ and $\varphi (x) = t + ε -
\biglp 0 + εf↑\prime (t) + O(ε↑2)\bigrp /\biglp f↑\prime
(t) + O(ε)\bigrp = t + O(ε↑2).$ Applying this idea
to power series, let $f(x) = V(x) - U(z),$ where $U$ and $V$
are the power series in Eq. (14). We wish to find the power series
$t$ in $z$ such that $f(t) = 0.$ Let $x = W↓1z +\cdots + W↓{n-1}z↑{n-1}
= t + O(z↑n)$ be an ``approximation'' to $t$ of order $n;$ then
$\varphi (x) = x - f(x)/f↑\prime (x)$ will be an approximation
of order $2n,$ since the assumptions of Newton's method hold
for this $f$ and $t.$
|qleft \quad We therefore can use the following procedure:

|qleft \12skip {\bf Algorithm N} ({\sl General power-series reversion
by Newton's method}). \xskip This ``semi-on-line'' algorithm inputs the
values of $U↓n$ and $V↓n$ in (14) for $2↑k ≤ n < 2↑{k+1}$ and
then outputs the values of $W↓n$ in (15) for $2↑k ≤ n < 2↑{k+1},$
thereby producing its answers in batches of $2↑k$ at a time,
for $k = 0, 1, 2, \ldots , K.$

|qleft \6skip |indent {\bf N1.} [Initialize.] Set $N ← 1. ($We
will have $N = 2↑k.)$ Input the first coefficients $U↓1$ and
$V↓1$ (where $V↓1 = 1),$ and set $W↓1 ← U↓1.$

\algstep N2. [Output.] Output $W↓n$ for $N ≤ n < 2N.$

\algstep N3. [Input.] Set $N ← 2N.$ If $N > 2↑K,$ the algorithm
terminates; otherwise input the values $U↓n$ and $V↓n$ for $N
≤ n < 2N.$

\algstep N4. [Newtonian step.] Use an algorithm for power series
composition (see exercise 11) to evaluate the coefficients $Q↓j,
R↓j$ $(0 ≤ j < N)$ in the power series
$$\eqalign{U↓1z +\cdots + U↓{2N-1}z↑{2N-1} - V(W↓1z +\cdots + W↓{N-1}z↑{N-1}) ⊗= R↓0z↑N + R↓1z↑{N+1} +\cdots + R↓{N-1}z↑{2N-1} + O(z↑{2N}),\cr
V↑\prime (W↓1z +\cdots + W↓{N-1}z↑{N-1}) ⊗= Q↓0 + Q↓1z +\cdots + Q↓{N-1}z↑{N-1} + O(z↑N),\cr}$$
\qquad where $V(x) = x + V↓2x↑2 +\cdots$
and $V↑\prime (x) = 1 + 2V↓2x +\cdots .$ Then set $W↓N, \ldots
, W↓{2N-1}$ to the coefficients in the power series
$$\qquad {R↓0 + R↓1z +\cdots + R↓{N-1}z↑{N-1}\over
Q↓0 + Q↓1z +\cdots + Q↓{N-1}z↑{N-1}} = W↓N + W↓{N+1}z
+\cdots + W↓{2N-1}z↑{N-1} + O(z↑N)$$
\qquad and return to step N2.

|qleft \6skip |cancelindent \quad The running time for this
algorithm to obtain the coefficients up to $N = 2↑K$ is $T(N),$
where
$$T(2N) = T(N) + \biglp \hbox{time to do step N4}\bigrp + O(N).\eqno (17)$$
Straightforward algorithms for composition and
division in step N4 will take order $N↑3$ steps, so Algorithm
N will run slower than Algorithm T. However, Brent and Kung
have found a way to do the required composition of power series
with $O\biglp (N \log N)↑{3/2}\bigrp$ arithmetic operations, and exercise
6 gives an even faster algorithm for division; hence (17) shows
that power series reversion can be achieved by doing only $O\biglp (N
\log N)↑{3/2}\bigrp$ operations as $N → ∞.$ (On the other hand the
constant of proportionality is such that $N$ must be really
large before Algorithms L and T lose out to this ``high-speed''
method.)
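|qleft \quad The following Python sketch of the Newtonian iteration (illustrative only; it uses naive $O(N↑3)$ composition and division, so it shows the doubling of accuracy rather than the asymptotic speed) reverts Eq. (10):

from fractions import Fraction

def pmul(a, b, M):                  # product of two series mod z^M
    c = [Fraction(0)] * M
    for i, ai in enumerate(a[:M]):
        for j, bj in enumerate(b[:M - i]):
            c[i + j] += ai * bj
    return c

def pcomp(V, x, M):                 # V(x(z)) mod z^M, assuming x[0] = 0
    out = [Fraction(V[0])] + [Fraction(0)] * (M - 1)
    p = [Fraction(1)] + [Fraction(0)] * (M - 1)
    for k in range(1, M):
        p = pmul(p, x, M)           # p = x(z)^k mod z^M
        vk = Fraction(V[k]) if k < len(V) else Fraction(0)
        for i in range(M):
            out[i] += vk * p[i]
    return out

def pdiv(a, b, M):                  # a(z)/b(z) mod z^M, b[0] != 0, per (3)
    w = []
    for n in range(M):
        acc = a[n] - sum(w[k] * b[n - k] for k in range(n))
        w.append(acc / b[0])
    return w

def newton_revert(V, K):            # reversion of (10) through z^(2^K)
    Vp = [(k + 1) * Fraction(V[k + 1]) for k in range(len(V) - 1)]  # V'(t)
    x, M = [Fraction(0), Fraction(1)], 2     # x = z, correct mod z^2
    for _ in range(K):
        M *= 2                               # each step doubles the accuracy
        x = x + [Fraction(0)] * (M - len(x))
        num = pcomp(V, x, M); num[1] -= 1    # V(x) - z
        den = pcomp(Vp, x, M)                # V'(x)
        x = [xi - di for xi, di in zip(x, pdiv(num, den, M))]
    return x                                 # x[n] = W_n

assert newton_revert([0, 1, -1], 2)[1:6] == [1, 1, 2, 5, 14]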
|qleft {\sl Historical note:} J. N. Bramhall and M. A.
Chapple published the first $O(n↑3)$ method for power series
reversion [{\sl CACM \bf 4} (1961), 317--318, 503]; it was an
off-line algorithm whose running time was approximately the
same as that of Algorithms L and T.

|qleft \24skip {\:r EXERCISES

|qleft }\12skip |newtype {\bf 1.} [$M{\sl 10}]$ The text explains
how to divide $U(z)$ by $V(z)$ when $V↓0 ≠ 0;$ how should the
division be done when $V↓0 = 0?$

\exno 2. [20] If the coefficients of $U(z)$ and $V(z)$ are integers
and $V↓0 ≠ 0,$ find a recurrence relation for the integers $V↑{n+1}↓{0}W↓n,$
where $W↓n$ is defined by (3). How would you use this for power
series division?

\exno 3. [M15] Does formula (9) give the right results when
$α = 0?$ When $α = 1?$

\exno 4. [HM23] Show that simple modifications of (9) can be
used to calculate $e↑{V(z)}$ and $\ln\biglp 1 + V(z)\bigrp ,$
when $V(z) = V↓1z + V↓2z↑2 +\cdots .$

\exno 5. [M00] What happens when a power series is reverted
twice; i.e., if the output of Algorithm L or T is reverted again?

\exno 6. [M21] (H. T. Kung.) Apply Newton's method to the computation
of $W(z) = 1/V(z),$ when $V(0) ≠ 0,$ by finding the power-series
root of $f(x) = x↑{-1} - V(z).$

\exno 7. [M23] Use Lagrange's inversion formula (12) to find
a simple expression for the coefficient $W↓n$ in the reversion
of $z = t - t↑m.$

\exno 8. [M25] Lagrange's inversion formula can be generalized
as follows: If $W(z) = W↓1z + W↓2z↑2 +\cdot \cdot \cdot = G↓1t
+ G↓2t↑2 + G↓3t↑3 +\cdot \cdot \cdot = G(t),$ where $z$ and
$t$ are related by Eq. (10), then $W↓n = U↓{n-1}/n$ where
$$$U↓0 + U↓1z + U↓2z↑2 +\cdot \cdot \cdot = (G↓1 + 2G↓2z + 3G↓3z↑2
+\cdot \cdot \cdot )(1 + V↓2z + V↓3z↑2 +\cdot \cdot \cdot )↑{-n}.$
|qctr \6skip (Equation (12) is the special case $G↓1 = 1,$ $G↓2
= G↓3 =\cdots = 0.$ This equation can be proved,
for example, by using tree-enumeration formulas as in exercise
2.3.4.4--33.)
|qleft \qquad Extend Algorithm L so that it obtains the coefficients
$W↓1, W↓2, \ldots ,$ in this more general situation, without
substantially increasing its running time.

|qleft \3skip {\bf 9.} [${\sl 11}]$ Find the values of $T↓{mn}$
computed by Algorithm T as it determines the first five coefficients
in the reversion of $z = t - t↑2.$

\exno 10. [M20] Given that $y = x↑α + a↓1x↑{α+1} + a↓2x↑{α+2} +\cdots ,$
$α ≠ 0,$ show how to compute the coefficients in the expansion
$x = y↑{1/α} + b↓2y↑{2/α} + b↓3y↑{3/α} +\cdots .$

\exno 11. [M25] Let
$$$U(z) = U↓0 + U↓1z + U↓2z↑2 +\cdot \cdot \cdot \quad$ and\quad
$V(z) = V↓1z + V↓2z↑2 + V↓3z↑3 +\cdots ;$
|qctr \6skip design an algorithm which computes the first $N$
coefficients of $U\biglp V(z)\bigrp .$

\exno 12. [M20] Find a connection between polynomial division
and power-series division: Given polynomials $u(x)$ and $v(x)$
of respective degrees $m$ and $n$ over a field, show how to
find the polynomials $q(x), r(x)$ such that $u(x) = q(x)v(x)
+ r(x),$ deg$(r) < n,$ using only operations on power series.

|qleft \24skip {\sl |newtype And it shall be,
|qright when thou hast made an end of reading this book,
|qright that thou shalt bind a stone to it,
|qright and cast it into the midst of Euphrates.}
|qright --- JEREMIAH 51:63